Abstract

Background: Mutations in PKHD1 cause autosomal recessive Caroli disease, which is a rare congenital disorder involving 
cystic dilatation of the intrahepatic bile ducts. However, the mutational spectrum of PKHD1 and the phenotype-genotype 
correlations have not yet been fully established. 

Methods: Whole exome sequencing (WES) was performed on a sample from one twin with Caroli disease from a Chinese family 
from Shandong province. Routine Sanger sequencing was used to validate the WES results and to carry out segregation studies. 
We also describe the genotype-phenotype association of the PKHD1 mutations in these twins. 

Results: A combination of WES and Sanger sequencing revealed the genetic defect to be a novel compound heterozygous 
genotype in PKHD1, comprising the missense mutation c.2507T>C, predicted to cause a valine to alanine substitution at 
codon 836 (p.Val836Ala), and the nonsense mutation c.2341C>T, predicted to replace an arginine with a 
stop codon at codon 781 (p.Arg781*). This compound heterozygous genotype co-segregates with the Caroli 
disease-affected pedigree members, but is absent in 200 normal chromosomes. 

Conclusions: Our findings indicate that exome sequencing can be useful in the diagnosis of Caroli disease patients and associate 
a compound heterozygous genotype in PKHD1 with Caroli disease, which further increases our understanding of the 
mutation spectrum of PKHD1 in association with Caroli disease. 
Introduction

Caroli disease is a rare and complex autosomal congenital 
disorder that presents as cystic dilatation of the intrahepatic bile 
ducts [1]. It most commonly manifests as jaundice, cirrhosis, and 
dilatation of renal tubules, as well as renal impairment in children 
with associated multicystic or polycystic kidney disease in the 
second to third decades of life [2,3]. It can also lead to persistent 
recurrent cholangitis caused by cholestasis, and, if left untreated, 
patients will eventually develop biliary cirrhosis, portal hypertension 
[4], and sometimes bile duct carcinoma [5]. The main mode 
of Caroli disease inheritance is an autosomal recessive form [6]. 
Since the causative mutation for autosomal recessive polycystic 
kidney disease (ARPKD), a major cause of renal and liver-related 
morbidity and mortality in neonates and infants [7], was 
identified, several cases of ARPKD with liver manifestations, 
including Caroli disease, have been reported. Mutations in 
polycystic kidney and hepatic disease gene 1 (PKHD1) are 
responsible for Caroli disease [8], and many causative mutations 
are known [9,10,11]. 

PKHD1 is located on chromosome 6p12.3-6p12.2 and contains 
a 16.2 kb coding sequence divided into 66 exons, separated by 
introns varying in size up to 472 kb [12,13]. It encodes 
fibrocystin/polyductin (FPC), a type of membrane-associated 
receptor-like protein [14,15] that is predominantly expressed in 
the apical domain of renal tubule epithelial cells, and may play an 
important role in collecting duct and biliary differentiation [16]. 
Recently, advances in next generation sequencing technologies 
have enabled whole exome sequencing (WES) to become a 
technically feasible and powerful tool for identifying pathogenic 
mutations in various Mendelian disorders [17], including rare 
diseases [18,19]. WES is especially well suited to detecting 
mutations in PKHD1, which comprises 66 exons, in families with 
Caroli disease. 

To date, the function of PKHD1 is still unclear, and we know 
little about the relationship between the PKHD1 genotype and 
clinical phenotype in Caroli disease. In the present study, 
therefore, we investigated a Chinese twin family with Caroli 
disease to detect PKHD1 mutations using WES, and to evaluate 
the genotype-phenotype correlation associated with these mutations. 
Materials and Methods

Subjects 

The subjects were from a dizygotic male twin family in 
Shandong province, China. The proband (Patient A, born as the 
second twin), a 10-year-old boy, the first twin (Patient B), and their 
parents underwent detailed clinical and ultrasonographic 
examinations. Both twins were clinically diagnosed with Caroli disease. 

The study protocol was approved by the Human Ethics 
Committee of the Affiliated Hospital of Medical College, Qingdao 
University (Shandong, China) and complies with the Code of 
Ethics of the World Medical Association. The parents of the 
subjects in this manuscript have given written informed consent 
(as outlined in the PLOS consent form) to publish these case 
details. Blood samples were collected 
from the twins and their parents. DNA was extracted with a 
standard phenol-chloroform extraction procedure, consisting of 
the lysis of white blood cells, followed by protein digestion, 
extraction of the DNA with phenol-chloroform, and precipitation 
of DNA with isopropanol. 

Whole exome sequencing 

WES was carried out on the proband using human exome 
capture, which was performed according to the protocol from 
Illumina's TruSeq Exome Enrichment Guide (SureSelectXT 
Target Enrichment System for Illumina Paired-End Sequencing 
Library, Agilent). The Agilent Human All Exon 50 Mb Exome 
Enrichment kit was used as the exome enrichment probe set. 
Genomic DNA libraries were prepared according to the 
manufacturer's instructions (Illumina, San Diego, CA). Briefly, 5 µg of 
genomic DNA in 80 µl of EB buffer was fragmented in a 
Bioruptor (Diagenode) to 100-500 bp fragments. DNA fragments 
of 150-250 bp were recovered by gel extraction, then end 
repair was performed using T4 DNA polymerase and Klenow 
polymerase to cleave 3' overhangs. An 'A' base was added to the 3' 
end using Klenow 3' to 5' exo minus, then DNA fragments were 
ligated to the Illumina multi-PE-adaptor. The DNA product was 
subsequently amplified by 12 cycles of PCR after 
mixing it with 1 µl of Illumina multi-PE primer #1 (25 µM), 1 µl 
of Illumina multi-PE primer #2 (0.5 µM), and 1 µl of Illumina 
index primer (25 µM). 

Captured DNA libraries were sequenced with an Illumina HiSeq 
2000, which yielded 200 bp (2×100 bp) from the final library 
fragments using V2 reagents. Base calling was performed by 1.8 
software (Illumina; data after 22nd June, 2011). The sequence 
reads obtained were aligned to the human genome reference 
sequence (NCBI36/hg18), and variations were identified using the 
software tool supplied with the instrument. In total, we obtained 
62.09 M high-quality reads, of which 44.85 M were mapped to the 
reference genome; the mean depth of the target region was 114.83×. 
The fractions of targeted bases covered at least 50×, 20×, 10×, 4×, 
and 1× were 75.81%, 82.23%, 89.04%, 93.56%, and 96.09%, 
respectively. Based on these general 
statistics, we performed further analysis. All identified PKHD1 
variations were annotated with information to identify candidate 
mutations: the depth of coverage, conservation across 
species, percentage of reads with the variant, novelty, potential 
splice site alteration, and likelihood that a variation is deleterious 
to the protein. This information was extracted from reference data 
sets or computed in bulk for all variations. 
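
The coverage summary reported above (mean depth and the fraction of targeted bases at or above each depth threshold) can be computed from per-base depths. The following is a minimal Python sketch under stated assumptions: `coverage_summary` and `mapping_rate` are illustrative helper names, not functions of the vendor software, and `toy_depths` is a made-up list used only to show the calculation, not data from this study. Only the read counts (62.09 M and 44.85 M) are taken from the text.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(50, 20, 10, 4, 1)):
    """Return (mean depth, {threshold: fraction of bases with depth >= threshold})."""
    if not depths:
        raise ValueError("depths must be non-empty")
    n = len(depths)
    fractions = {t: sum(d >= t for d in depths) / n for t in thresholds}
    return mean(depths), fractions


def mapping_rate(mapped_reads, total_reads):
    """Fraction of high-quality reads that mapped to the reference genome."""
    return mapped_reads / total_reads


# Toy per-base depths (hypothetical, for illustration only).
toy_depths = [0, 3, 12, 25, 60, 120, 200, 80, 55, 5]
mean_depth, fractions = coverage_summary(toy_depths)

# Read counts reported in the text, in millions.
rate = mapping_rate(44.85, 62.09)
print(f"mean depth {mean_depth}x, mapping rate {rate:.1%}")
for threshold, fraction in fractions.items():
    print(f">= {threshold}x: {fraction:.0%}")
```

With the reported read counts, the mapping rate works out to roughly 72%.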

Mutation analysis and confirmation 

Variants of PKHD1 identified by exome sequencing were 
confirmed using Sanger sequencing. Two fragments covering the 
coding sequence and the flanking intronic sequence of PKHD1 
(MIM# 606702, GenBank NM_138694.3) were amplified using 
PKHD1 primer pairs for exon 23 (Forward: 
5'-CTCCCTTACTGAGTTTCC-3' and Reverse: 
5'-AACAATAAGTCCCTTTCC-3') and exon 24 (Forward: 
5'-GATGAAACTCTGTAAGGTGGAT-3' and Reverse: 
5'-GGAAGGGAGATGTTGGGT-3'). 
Identical amplification conditions were used for both primer pairs 
in a total volume of 25 µl containing 250 nM dNTPs, 100 ng of 
template DNA, 0.5 µM of each primer, and 1.25 U AmpliTaq 
Gold DNA polymerase in 1× reaction buffer (10 mM Tris-HCl, 
pH 8.3, 50 mM KCl, 2.5 mM MgCl2). PCR amplifications were 
performed with an initial denaturing step at 94°C for 5 min, then 
35 cycles of: 94°C for 30 s, 59°C or 63°C (for exons 23 and 24, 
respectively) for 60 s, 72°C for 30 s, followed by 10 min of final 
extension at 72°C. Amplified PCR products were purified and 
sequenced using the appropriate PCR primers and the BigDye 
Terminator Cycle Sequencing kit (Applied Biosystems, Foster 
City, CA) and run on an automated sequencer, ABI 3730XL 
(Applied Biosystems) to perform mutational analysis. 
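
The cycling conditions described above can be written down as a small data structure, which makes the two exon-specific annealing temperatures explicit. This is only an illustrative Python sketch: the names `Step`, `build_program`, and `total_hold_seconds` are hypothetical, and the total counts programmed hold times only, ignoring ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(annealing_temp_c, cycles=35):
    """Expand the thermocycling protocol from the text into a flat list of steps."""
    program = [Step("initial denaturation", 94.0, 5 * 60)]
    for _ in range(cycles):
        program.append(Step("denaturation", 94.0, 30))
        program.append(Step("annealing", annealing_temp_c, 60))
        program.append(Step("extension", 72.0, 30))
    program.append(Step("final extension", 72.0, 10 * 60))
    return program


def total_hold_seconds(program):
    """Sum of programmed hold times (ramp times are not included)."""
    return sum(step.seconds for step in program)


# Annealing temperatures stated in the text for each exon.
ANNEALING_TEMP_C = {"exon 23": 59.0, "exon 24": 63.0}

for exon, temp in ANNEALING_TEMP_C.items():
    minutes = total_hold_seconds(build_program(temp)) / 60
    print(f"{exon}: annealing at {temp} C, {minutes:.0f} min of hold time")
```

Both programs have the same programmed hold time, since only the annealing temperature differs between the two exons.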

Denaturing high-performance liquid chromatography 
(DHPLC) screening of the PKHD1 mutation 

Mutation screening of the fragments harboring the c.2341C>T 
and c.2507T>C mutations in exons 23 and 24 of PKHD1 was 
performed with denaturing high-performance liquid 
chromatography (Wave DHPLC; Transgenomic, Omaha, NE) in 
100 normal controls. DHPLC used an initial concentration of 
48% buffer A (0.1 M triethylammonium acetate (TEAA); 
Transgenomic) and 52% buffer B (0.1 M TEAA containing 25% 
acetonitrile; Transgenomic) at 65°C. Data were analyzed by 
comparing the chromatograms. 

Results 

Clinical phenotype 

The proband (Patient A) was healthy until the age of 1 year, 
when he was found to have asymptomatic splenomegaly. At 5 
years of age he presented with anorexia and an upper abdominal 
mass and was diagnosed with Caroli disease. Two years later, mild 
generalized jaundice of the skin was apparent, and the abdominal 
mass had enlarged, causing upper abdominal pain. This was 
accompanied by liver cirrhosis, hypersplenism, severe anemia, and 
a polycystic kidney. The first twin (Patient B) had no obvious 
clinical symptoms except for intrahepatic bile duct dilatation. Both 
parents were negative for liver and renal anomalies 
on ultrasonography, were non-consanguineous, and 
had no family history of genetic diseases. 

Mutation analysis 

As PKHD1 is a long gene, composed of 66 exons, we performed 
WES and applied several filtering steps to exclude variants 
listed in dbSNP and the 1000 Genomes database and to 
select nucleotide changes predicted to have a damaging effect 
on the PKHD1 protein by SIFT (Sorting Intolerant From Tolerant) 
and PolyPhen-2. The depths of coverage for the c.2341C>T and 
c.2507T>C mutations in exons 23 and 24 of PKHD1 were 77× and 
105×, respectively, which suggests high reliability of sequencing. Sanger 
sequencing confirmation of the proband revealed a compound 
heterozygous genotype, based on a novel missense variant, 
c.2507T>C (p.Val836Ala, SIFT score 0.02, PolyPhen-2 score 0.998), 
predicted to cause a valine to alanine substitution at codon 836 in 
exon 24 (accession no. NM_138694.3; first nucleotide of the 
initiation codon numbered 1), and a known nonsense mutation, 
c.2341C>T (p.Arg781*), predicted to change an arginine to a stop 
codon at codon 781 in exon 23 (Figures 1, 2). Mutation p.Arg781* 
has previously been described in Caucasian-American patients 
and those from the Netherlands, France, Denmark, Germany, 
Portugal, and Belgium. 

Figure 1. Partial sequence of exon 23 of PKHD1 from members of this Caroli disease-affected pedigree. (a), (b) and (d) Arrowhead 
indicates the heterozygous C and T at nucleotide 2341 in the proband, the elder twin, and their mother, respectively. (c) Arrowhead indicates the C at 
nucleotide 2341 (wild type) in their father. 
doi:10.1371/journal.pone.0092661.g001 
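
The codon numbers quoted for the two variants follow directly from the coding-sequence positions when the first nucleotide of the initiation codon is numbered 1. The sketch below is a minimal Python illustration of that arithmetic; the function names are hypothetical. The reference codons themselves are not given in the text, so the final checks rely only on the standard genetic code: every valine codon (GTN) becomes an alanine codon (GCN) after a T>C change at its second base, and CGA is the only arginine codon that a first-base C>T change turns into a stop codon (TGA), which is therefore an inference rather than a reported fact.

```python
def codon_number(cds_pos):
    """1-based codon index for a 1-based coding position (A of ATG = 1)."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_pos - 1) // 3 + 1


def position_in_codon(cds_pos):
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_pos - 1) % 3 + 1


def substitute(codon, pos, base):
    """Return the codon with the base at 1-based position `pos` replaced."""
    return codon[: pos - 1] + base + codon[pos:]


# Standard genetic code entries relevant to the two variants.
VAL_CODONS = ["GTT", "GTC", "GTA", "GTG"]
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

for name, pos in (("c.2507T>C", 2507), ("c.2341C>T", 2341)):
    print(name, "-> codon", codon_number(pos), "base", position_in_codon(pos))

# T>C at the second base of any valine codon yields an alanine codon.
assert all(substitute(c, 2, "C") in ALA_CODONS for c in VAL_CODONS)
# C>T at the first base of CGA (arginine) yields the stop codon TGA.
assert substitute("CGA", 1, "T") in STOP_CODONS
```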

Co-segregation analysis of this pedigree revealed that the first 
twin also carries the same compound heterozygous genotype, and 
that both parents are carriers of a single heterozygous mutation. 
The father harbors the p.Val836Ala mutant, while the mother 
carries the p.Arg781* variant. These two different 
variants were inherited one from each parent, resulting 
in the compound heterozygous genotype co-segregating with the 
Caroli disease-affected pedigree in both twins. 
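
The segregation logic described above, in which each twin carries two heterozygous variants with one traced to each parent, can be expressed as a simple set check. This is a minimal Python sketch, assuming each individual is represented by the set of variants they carry in the heterozygous state; `is_compound_het_in_trans` is an illustrative name, and the counter-example family at the end is hypothetical.

```python
def is_compound_het_in_trans(child, father, mother):
    """True if the child carries exactly two variants, one inherited only
    from the father and the other only from the mother."""
    paternal_only = (child & father) - mother
    maternal_only = (child & mother) - father
    return len(child) == 2 and len(paternal_only) == 1 and len(maternal_only) == 1


# Genotypes as reported in the text.
father = {"p.Val836Ala"}
mother = {"p.Arg781*"}
twins = {
    "Patient A (proband)": {"p.Val836Ala", "p.Arg781*"},
    "Patient B (first twin)": {"p.Val836Ala", "p.Arg781*"},
}

for name, genotype in twins.items():
    print(name, is_compound_het_in_trans(genotype, father, mother))

# Hypothetical counter-example: both variants carried by the same parent,
# so the phase in the child cannot be established as trans from the pedigree.
cis_father = {"p.Val836Ala", "p.Arg781*"}
print(is_compound_het_in_trans(twins["Patient A (proband)"], cis_father, set()))
```

The check returns true for both twins because each variant is found in exactly one parent, which is what places the two variants on different alleles.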

Comparing the WES findings with the different phenotypes of 
the twins, we examined whether any variants in genes related to 
Caroli disease, hepatic function, or kidney function could act as a 
genetic modifier for Caroli disease (such as NPHP3, PKD1 and so 
on). However, no such variants were identified. 

DHPLC screening of the PKHD1 mutation 

Analysis of 200 normal chromosomes from 100 healthy controls 
of Chinese Han origin by DHPLC found no evidence of the novel 
missense variant, c.2507T>C (p.Val836Ala), or the p.Arg781* 
variant (Figure 3). This suggests that the compound heterozygous 
genotype observed in this family is causative of the Caroli disease 
phenotype. 
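
To give a sense of what absence from 200 control chromosomes can and cannot exclude, the following Python sketch computes an upper confidence bound on the population allele frequency when a variant is observed zero times. This statistical framing is an added illustration and was not part of the analysis described here; it assumes independent sampling of chromosomes, and `upper_bound_allele_freq` is a hypothetical helper name.

```python
def upper_bound_allele_freq(n_chromosomes, confidence=0.95):
    """One-sided upper bound on allele frequency given 0 carriers among
    n_chromosomes, from solving (1 - p) ** n = 1 - confidence for p."""
    if n_chromosomes < 1:
        raise ValueError("n_chromosomes must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_chromosomes)


n = 2 * 100  # two chromosomes per control, 100 controls as stated in the text
bound = upper_bound_allele_freq(n)
rule_of_three = 3 / n  # common quick approximation to the same bound

print(f"95% upper bound: {bound:.4f}; rule-of-three approximation: {rule_of_three:.4f}")
```

Under these assumptions, zero observations in 200 chromosomes are compatible with an allele frequency of up to about 1.5%, so the control screen argues against a common polymorphism rather than proving pathogenicity on its own.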

Bioinformatic analysis of PKHD1 mutation 

We obtained PKHD1 family protein sequences from the NCBI and 
UCSC websites and used Vector NTI software to perform 
multiple-sequence alignments in various animal species, including 
Mus musculus, Rattus norvegicus, Pan paniscus, Xenopus (Silurana) 
tropicalis, Falco cherrug, Zonotrichia albicollis and Homo sapiens 
(Figure 4). The p.Val836Ala variant was found to be located in 
a highly conserved region of the PKHD1 protein. 

Discussion 

Caroli disease, which has an estimated incidence of 
approximately 1 in 100,000 newborns, is a complex disorder of the 
intrahepatic bile ducts presenting with multiple saccular segmental 
and cystic dilatations [20]. When progressive, it can cause 
recurrent cholangitis, jaundice, the accumulation of intrahepatic 
stones, portal hypertension, liver failure, and even 
cholangiocarcinoma [4,21]. The pathogenesis of Caroli disease appears to be 
related to dilatations and malformation of the ductal plate, which are 
either diffuse or confined to only one part of the liver [22]. It can 
be divided into the pure form of Caroli disease and Caroli's 
syndrome, which presents with repeated bouts of cholangitis 
resulting from bile stasis, hepatolithiasis, gall bladder stones, and 
symptoms associated with hepatic fibrosis such as portal 
hypertension and poor hepatic reserve. The disease spectrum of clinical 
phenotypes caused by mutations in PKHD1 is relatively complex, 
ranging from perinatally fatal ARPKD to Congenital Hepatic 
Fibrosis (CHF)-predominant presentations in adulthood with mild 
or no apparent kidney disease [23]. 

In 2002, Ward and colleagues first screened PKHD1 mutations 
in 14 probands with ARPKD (some cases of ARPKD with mainly 
liver manifestations, including Caroli disease diagnosed in 
adulthood), and revealed that eight of the affected individuals 
were compound heterozygotes [16]. Since then, several large-scale 
mutation detection studies have focused on the longest PKHD1 
ORF (open reading frame) [24,25,26]. Mera et al. used direct 
sequencing to detect PKHD1 mutations in a cohort of 90 North 
American ARPKD/CHF patients, and identified 77 PKHD1 
sequence variants, which supported previously published 
genotype-phenotype correlation findings [23]. Sandro et al. used 
DHPLC to report a compound heterozygous genotype 
(c.10364delG/p.Ile3468Val) in PKHD1 in a 36-year-old female 
with Caroli disease [27]. To date, at least 300 different pathogenic 
mutations have been found throughout most of the coding exons 
of the human PKHD1 gene (http://www.humgen.rwth-aachen.de/). 
Approximately 60% of these are truncating mutations and 
40% are missense mutations, suggesting that the recessive form of 
this disease results from loss of function of the normal protein to 
varying degrees [28]. 

Figure 2. Partial sequence of exon 24 of PKHD1 from members of this Caroli disease-affected pedigree. (b), (c) and (d) Arrowhead 
indicates the heterozygous T and C at nucleotide 2507 in the proband, the elder twin, and their father, respectively. (a) Arrowhead indicates the T at 
nucleotide 2507 (wild type) in their mother. 
doi:10.1371/journal.pone.0092661.g002 

Figure 3. DHPLC wave patterns of wild-type and mutant 
PKHD1. Wild type: from normal individuals. Mutant type: from 
affected individuals. 
doi:10.1371/journal.pone.0092661.g003 

Although severely affected ARPKD and CHF patients account 
for most known PKHD1 mutations, and patients with Caroli 
disease have a low rate of PKHD1 mutation detection [27], the 
genotype-phenotype relationship for PKHD1 mutations is relatively 
complex. ARPKD patients carrying two truncating mutations 
have a severe disease phenotype resulting in perinatal death, while 
other combinations of mutations, such as splicing and missense 
mutations, have a more variable but usually less severe phenotype 
[29]. Meral et al. analyzed the clinical, molecular, PKHD1 
mutation, and imaging data of 73 patients with ARPKD and 
CHF (including 51 with Caroli syndrome). Although biallelic 
PKHD1 mutations were identified in only 43 families and one 
heterozygous mutation in 20 families, the authors concluded that 
kidney and liver disease are independent, and that variability in 
severity does not reflect the type of PKHD1 mutation [11]. 

Figure 4. Multiple-sequence alignment of the PKHD1 protein including Mus musculus, Rattus norvegicus, Pan paniscus, Xenopus 
(Silurana) tropicalis, Falco cherrug, Zonotrichia albicollis and Homo sapiens. The Val 836 residue is located within a highly conserved region. 
doi:10.1371/journal.pone.0092661.g004 

In this work, we identified a causative compound heterozygous 
PKHD1 mutation in a Chinese twin family with Caroli disease. 
Because of the large size of PKHD1, DHPLC or single-strand 
conformation polymorphism (SSCP) screening techniques have 
been used in all but one of the published studies; variants detected 
by screening were further characterized by targeted direct 
sequencing. However, DHPLC and SSCP have their own 
shortcomings, so, more recently, next generation sequencing 
technologies have been employed to rapidly accelerate the 
discovery of the genetic causes of human diseases. WES is a 
powerful tool for investigating the genetic underpinnings of human 
disease [30]. As Mendelian pathogenic mutations are frequently 
exonic, exome sequencing is an efficient method to simultaneously 
examine many coding regions, and has rapidly proven to be an 
important tool in genetic research [31]. It is especially suited to 
large genes such as PKHD1, as it not only avoids the 
time-consuming and laborious nature of traditional PCR but also has 
relatively lower costs. Therefore, we performed WES to screen 
for PKHD1 mutations. 

We found a novel compound heterozygous genotype of PKHD1 
(p.Val836Ala/p.Arg781*) in the twins of a Chinese family with Caroli 
disease. Their father harbors the p.Val836Ala mutant, while their 
mother carries the p.Arg781* variant. The p.Val836Ala mutation 
results in an amino acid change from valine to alanine, but as both 
amino acids are neutral the potential impairment of protein 
function is unclear; however, the amino acid at position 836 is 
highly conserved between species. The p.Arg781* variation leads 
to a truncated PKHD1 protein, which lacks the succeeding seven 
IPT domains and two G8 domains, thus losing the functionality of 
the wild-type protein. Zerres et al. first reported p.Arg781* in 
ARPKD [7], showing that the same mutation can cause different 
clinical phenotypes. 

As seen in the present family, the proband's clinical phenotype 
of Caroli syndrome included liver cirrhosis, hypersplenism, severe 
anemia, and a polycystic kidney, while the first twin suffered from 
pure Caroli disease. Therefore, the genotype-phenotype relationship 
for PKHD1 mutations is complex. Sgro et al. [9] reported the 
compound heterozygous genotype PKHD1 IVS55+1G>A/p.Trp2690Arg 
in patients detected prenatally with Caroli disease; 
IVS55+1G>A was inherited from the father and the missense 
mutation p.Trp2690Arg from the mother, which is consistent with 
a recessive pattern of inheritance. Our findings also lay the basis 
for a more accurate and rapid prenatal diagnosis of Caroli disease 
in its early stages. 

In conclusion, we combined WES with Sanger sequencing to 
report a novel compound heterozygous genotype of PKHD1 
causative of Caroli disease in a Chinese twin family. The 
disease phenotype differs markedly between the 
twins. Our study extends the known genotype-phenotype 
correlations of PKHD1, which might be useful for understanding 
the pathophysiological mechanisms of Caroli disease. 